35 research outputs found

    Gene Ontology Function prediction in Mollicutes using Protein-Protein Association Networks

    Background: Many complex systems can be represented and analysed as networks. The recent availability of large-scale datasets has made it possible to elucidate some of the organisational principles and rules that govern their function, robustness and evolution. However, one of the main limitations in using protein-protein interactions for function prediction is the availability of interaction data, especially for Mollicutes. If we could harness predicted interactions, such as those from a Protein-Protein Association Network (PPAN), combining several protein-protein network function-inference methods with semantic similarity calculations, the use of protein-protein interactions for functional inference in these organisms would become potentially more useful. Results: In this work we show that using PPAN data combined with other approximations, such as functional module detection, orthology exploitation methods and Gene Ontology (GO)-based information measures, helps to predict protein function in Mycoplasma genitalium. Conclusions: To our knowledge, the proposed method is the first that combines functional module detection among species, exploiting an orthology procedure and using information theory-based GO semantic similarity in PPANs of the Mycoplasma species. The results of an evaluation show a higher recall than previously reported methods that focused on only one organism network.
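As an illustration of what an information theory-based GO semantic similarity can look like, the sketch below computes a Resnik-style score: the information content, IC(t) = -log p(t), of the most informative ancestor shared by two terms. The abstract does not say which measure the authors used, so the choice of a Resnik-style score is an assumption, and the ontology, term names and annotation counts are invented toy values rather than real GO data.

```python
from math import log

# Toy ontology (hypothetical terms, not real GO identifiers): child -> parents.
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
    "rna_binding": ["binding"],
}
# Made-up number of proteins annotated directly to each term.
DIRECT_COUNTS = {
    "root": 0,
    "binding": 2,
    "catalysis": 4,
    "dna_binding": 1,
    "rna_binding": 1,
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(term):
    """IC(t) = -log p(t), where p(t) counts annotations to t or any descendant."""
    total = sum(DIRECT_COUNTS.values())
    covered = sum(n for t, n in DIRECT_COUNTS.items() if term in ancestors(t))
    return -log(covered / total)


def resnik(term_a, term_b):
    """IC of the most informative ancestor shared by both terms."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(information_content(t) for t in common)
```

Two sibling terms under "binding" score log 2 in this toy example, while terms that share only the root score zero: the more specific the shared ancestor, the higher the similarity.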

    DockAnalyse : an application for the analysis of protein-protein interactions

    Background: Is it possible to identify what the best solution of a docking program is? The usual answer to this question is the highest score solution, but interactions between proteins are dynamic processes, and the interaction regions are often wide enough to permit protein-protein interactions with different orientations and/or interaction energies. In some cases, as in a multimeric protein complex, several interaction regions are possible among the monomers. These dynamic processes involve interactions with surface displacements between the proteins to finally achieve the functional configuration of the protein complex. Consequently, there is not a single, static solution for the interaction between proteins, but several important configurations that also have to be analyzed. Results: To extract those representative solutions from the docking output data file, we have developed an unsupervised and automatic clustering application, named DockAnalyse. This application is based on the already existing DBscan clustering method, which searches for continuities among the clusters generated by the docking output data representation. The DBscan clustering method is very robust and, moreover, solves some of the inconsistency problems of classical clustering methods, such as the treatment of outliers and the dependence on a previously defined number of clusters. Conclusions: DockAnalyse makes the interpretation of the docking solutions through graphical and visual representations easier by guiding the user to find the representative solutions. We have applied our new approach to analyze several protein interactions and model the dynamic protein interaction behavior of a protein complex. DockAnalyse might also be used to describe interaction regions between proteins and, therefore, guide future flexible dockings. The application (implemented in the R package) is accessible
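The two properties the abstract attributes to DBscan (outliers are set aside as noise, and the number of clusters is not fixed in advance) can be seen in a minimal pure-Python sketch of the general density-based algorithm. This is not the DockAnalyse code, which the abstract describes as R-based; the function name `dbscan`, the parameters `eps` and `min_pts`, and the 2D points standing in for docking solutions are all assumptions made for illustration.

```python
from math import dist


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1  # point i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # noise reachable from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is also a core point: keep growing
    return labels


# Hypothetical coordinates: two dense groups and one isolated solution.
solutions = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(solutions, eps=1.5, min_pts=3)
```

With these toy inputs the two dense groups become clusters 0 and 1 and the isolated point is labelled -1, without the number of clusters ever being specified.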

    Bioinformatics and Moonlighting Proteins

    Multitasking or moonlighting is the capability of some proteins to execute two or more biochemical functions. Usually, moonlighting proteins are experimentally revealed by serendipity. For this reason, it would be helpful if bioinformatics could predict this multifunctionality, especially because of the large amounts of sequences from genome projects. In the present work, we analyse and describe several approaches that use sequences, structures, interactomics and current bioinformatics algorithms and programs to try to overcome this problem. Among these approaches are: a) remote homology searches using PSI-BLAST, b) detection of functional motifs and domains, c) analysis of data from protein-protein interaction (PPI) databases, d) matching the query protein sequence to 3D databases (e.g., with algorithms such as PISITE), e) mutation correlation analysis between amino acids with algorithms such as MISTIC. Programs designed to identify functional motifs/domains detect mainly the canonical function but usually fail to detect the moonlighting one, Pfam and ProDom being the best methods. Remote homology search by PSI-BLAST combined with data from interactomics (PPI) databases has the best performance. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can only be used in very specific situations (it requires the existence of multialigned family protein sequences) but can suggest how the evolutionary process of second function acquisition took place. The multitasking protein database MultitaskProtDB (http://wallace.uab.es/multitask/), previously published by our group, has been used as a benchmark for all of the analyses.

    MultitaskProtDB: a database of multitasking proteins

    We have compiled MultitaskProtDB, available online at http://wallace.uab.es/multitask, to provide a repository where the many multitasking proteins found in the literature can be stored. Multitasking or moonlighting is the capability of some proteins to execute two or more biological functions. Usually, multitasking proteins are experimentally revealed by serendipity. This ability of proteins to perform multitasking functions helps us to understand one of the ways used by cells to perform many complex functions with a limited number of genes. Even so, the study of this phenomenon is complex because, among other things, there is no database of moonlighting proteins. The existence of such a tool facilitates the collection and dissemination of these important data. This work reports the database, MultitaskProtDB, which is designed as a user-friendly web page containing >288 multitasking proteins with their NCBI and UniProt accession numbers, canonical and additional biological functions, monomeric/oligomeric states, PDB codes when available and bibliographic references. This database also serves to gain insight into some characteristics of multitasking proteins such as frequencies of the different pairs of functions, phylogenetic conservation and so forth. Ministerio de Ciencia y Tecnología de Espanya [BIO2007-67904-C02-01, BFU2010-22209-C02-01]; Centre de Referència de R+D de Biotecnologia de la Generalitat de Catalunya; La Marató de TV3 [101930/31/32/33]; Comisión Coordinadora del Interior de Uruguay. The English of this manuscript has been corrected by Ms Lynn Strother. Funding for open access charge: [BIO2007-67904-C02-01 and BFU2010-22209-C02-01]

    Can bioinformatics help in the identification of moonlighting proteins?

    Protein multitasking or moonlighting is the capability of certain proteins to execute two or more unique biological functions. This ability to perform moonlighting functions helps us to understand one of the ways used by cells to perform many complex functions with a limited number of genes. Usually, moonlighting proteins are revealed experimentally by serendipity, and the proteins described probably represent just the tip of the iceberg. It would be helpful if bioinformatics could predict protein multifunctionality, especially because of the large amounts of sequences coming from genome projects. In the present article, we describe several approaches that use sequences, structures, interactomics and current bioinformatics algorithms and programs to try to overcome this problem. The sequence analysis has been performed: (i) by remote homology searches using PSI-BLAST, (ii) by the detection of functional motifs, and (iii) by the co-evolutionary relationship between amino acids. Programs designed to identify functional motifs/domains are basically oriented to detect the main function, but usually fail to detect secondary ones. Remote homology searches such as PSI-BLAST seem to be more versatile in this task, and they are a good complement to the information obtained from protein-protein interaction (PPI) databases. Structural information and mutation correlation analysis can help us to map the functional sites. Mutation correlation analysis can be used only in very restricted situations, but can suggest how the evolutionary process of the acquisition of the second function took place.

    Multifunctional Proteins : Involvement in Human Diseases and Targets of Current Drugs

    Multifunctionality or multitasking is the capability of some proteins to execute two or more biochemical functions. The objective of this work is to explore the relationship between multifunctional proteins, human diseases and drug targeting. The analysis of the proportion of multitasking proteins from the MultitaskProtDB-II database shows that 78% of the proteins analyzed are involved in human diseases. This percentage is much higher than the 17.9% found in human proteins in general. A similar analysis using drug target databases shows that 48% of these analyzed human multitasking proteins are targets of current drugs, while only 9.8% of the human proteins present in UniProt are specified as drug targets. In almost 50% of these proteins, both the canonical and moonlighting functions are related to the molecular basis of the disease. A procedure to identify multifunctional proteins from disease databases and a method to structurally map the canonical and moonlighting functions of the protein have also been proposed here. Both of the previous percentages suggest that multitasking is not a rare phenomenon in proteins causing human diseases, and that their detailed study might explain some collateral drug effects
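The two comparisons in this abstract can be read as fold enrichments, as in the short worked calculation below. The percentages are the ones quoted in the abstract; the helper name `fold_enrichment` is introduced here purely for illustration, and the abstract itself does not report these ratios.

```python
def fold_enrichment(observed_pct, background_pct):
    """Ratio of the percentage in the studied set to the background percentage."""
    return observed_pct / background_pct


# Disease involvement: 78% of multitasking proteins vs 17.9% of human proteins.
disease_fold = fold_enrichment(78.0, 17.9)

# Drug targets: 48% of human multitasking proteins vs 9.8% of human UniProt proteins.
drug_target_fold = fold_enrichment(48.0, 9.8)

print(f"disease involvement: {disease_fold:.2f}-fold")
print(f"drug targets: {drug_target_fold:.2f}-fold")
```

On these figures, multitasking proteins are between four and five times more likely than human proteins in general to be disease-related or to be drug targets.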

    Do protein–protein interaction databases identify moonlighting proteins?

    One of the most striking results of the human (and mammalian) genomes is the low number of protein-coding genes. To date, the main molecular mechanism known to increase the number of different protein isoforms and functions is alternative splicing. However, a less-known way to increase the number of protein functions is the existence of multifunctional, multitask, or “moonlighting” proteins. By and large, moonlighting proteins are experimentally disclosed by serendipity. Proteomics is becoming one of the most active areas of biomedical research, and it permits researchers to identify previously unseen connections among proteins and pathways. In principle, protein–protein interaction (PPI) databases should contain information on moonlighting proteins and could provide suggestions for further analysis in order to prove the multifunctionality. As far as we know, nobody has verified whether PPI databases actually disclose moonlighting proteins. In the present work we check whether well-established moonlighting proteins present in PPI databases connect with their known partners and, therefore, whether a careful inspection of these databases could help to suggest their different functions. The results of our research suggest that PPI databases could be a valuable tool to suggest multifunctionality.

    A hypothesis explaining why so many pathogen virulence proteins are moonlighting proteins

    Moonlighting or multitasking proteins are proteins with two or more functions performed by a single polypeptide chain. Proteins that belong to key ancestral functions and metabolic pathways such as primary metabolism typically exhibit the moonlighting phenomenon. We have collected 698 moonlighting proteins in the MultitaskProtDB-II database. A survey shows that 25% of the proteins of the database correspond to moonlighting functions related to pathogen virulence activity. Why is the canonical function of these virulence proteins mainly from ancestral key biological functions (especially of primary metabolism)? Our hypothesis is that these proteins present a high conservation between the pathogen protein and the host counterparts. Therefore, the host immune system will not elicit protective antibodies against pathogen proteins. The fact of sharing epitopes with host proteins (known as epitope mimicry) might be the cause of autoimmune diseases. Although many pathogen proteins can be antigenic, only a few of them would elicit a protective immune response. This would also explain the lack of successful vaccines based on these conserved moonlighting proteins. This review looks at why so many pathogen virulence proteins are from the primary metabolism and are conserved between pathogen and host.

    MultitaskProtDB-II : an update of a database of multitasking/moonlighting proteins

    Multitasking, or moonlighting, is the capability of some proteins to execute two or more biological functions. MultitaskProtDB-II is an updated database of multifunctional proteins. The previous version contained the following information: NCBI and UniProt accession numbers, canonical and additional biological functions, organism, monomeric/oligomeric states, PDB codes and bibliographic references. In the present update, the number of entries has been increased from 288 to 694 moonlighting proteins. MultitaskProtDB-II is continually being curated and updated. The new database also contains the following information: GO descriptors for the canonical and moonlighting functions, three-dimensional structure (for those proteins lacking a PDB structure, a model was made using I-TASSER and Phyre), the involvement of the proteins in human diseases (78% of human moonlighting proteins) and whether the protein is a target of a current drug (48% of human moonlighting proteins). These numbers highlight the importance of these proteins for the analysis and explanation of human diseases and target-directed drug design. Moreover, 25% of the proteins of the database are involved in virulence of pathogenic microorganisms, largely in the mechanism of adhesion to the host. This highlights their importance for the mechanism of microorganism infection and vaccine design. MultitaskProtDB-II is available at http://wallace.uab.es/multitaskII

    Pathogen Proteins Eliciting Antibodies Do Not Share Epitopes with Host Proteins: A Bioinformatics Approach

    The best way to prevent diseases caused by pathogens is by the use of vaccines. The advent of genomics enables genome-wide searches for new vaccine candidates, an approach called reverse vaccinology. The most common strategy to apply reverse vaccinology is by designing subunit recombinant vaccines, which usually generate a humoral immune response due to B-cell epitopes in proteins. A major problem for this strategy is the identification of protective immunogenic proteins from the surfome of the pathogen. Epitope mimicry may lead to auto-immune phenomena related to several human diseases. A sequence-based computational analysis has been carried out applying the BLASTP algorithm. For this purpose, two large databases have been created, one with the most complete and current linear B-cell epitopes, and the other one with the surface-protein sequences of the main human respiratory bacterial pathogens. We found that none of the 7353 linear B-cell epitopes analysed shares any sequence identity region with human proteins capable of generating antibodies, and that only 1% of the 2175 exposed proteins analysed contain a stretch of shared sequence with the human proteome. These findings suggest the existence of a mechanism to avoid autoimmunity. We also propose a strategy for corroborating or warning about the viability of a protein linear B-cell epitope as a putative vaccine candidate in a reverse vaccinology study; so, epitopes without any sequence identity with human proteins should be very good vaccine candidates, and the other way around.
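The idea of a "stretch of shared sequence" between an epitope and a host protein can be illustrated with a simple exact-match scan. This is a deliberately simplified stand-in for the BLASTP comparison named in the abstract, not a reimplementation of it: BLASTP scores inexact local alignments, whereas the sketch only finds identical substrings. The function name `shared_stretches`, the `min_len` threshold and the example sequences are hypothetical.

```python
def shared_stretches(epitope, protein, min_len=5):
    """Return the substrings of length min_len that occur in both sequences."""
    epitope_kmers = {
        epitope[i:i + min_len] for i in range(len(epitope) - min_len + 1)
    }
    return {
        protein[i:i + min_len]
        for i in range(len(protein) - min_len + 1)
        if protein[i:i + min_len] in epitope_kmers
    }


# Made-up amino-acid strings, used only to exercise the function.
hits = shared_stretches("ACDEFGHIK", "MMCDEFGWW", min_len=5)
no_hits = shared_stretches("ACDEFGHIK", "LLLLLLLL", min_len=5)
```

Under the strategy proposed in the abstract, an epitope with an empty result against every human protein would be the more promising vaccine candidate, and one with hits would be flagged for caution.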